Dataset statistics
| Number of variables | 26 |
|---|---|
| Number of observations | 8161 |
| Missing cells | 2405 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.6 MiB |
| Average record size in memory | 208.0 B |
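The derived rows in this table can be checked from the raw counts. The sketch below is a sanity check rather than the profiler's own computation. It assumes each cell occupies 8 bytes, which is inferred from the 208 B record size (26 × 8) and is not stated in the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 26
n_observations = 8161
missing_cells = 2405

total_cells = n_variables * n_observations        # 212186 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

# Assumption: every cell takes 8 bytes (a 64-bit number or an object pointer).
bytes_per_cell = 8
record_bytes = n_variables * bytes_per_cell       # bytes per row
total_mib = total_cells * bytes_per_cell / 2**20  # bytes converted to MiB

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {record_bytes} B")
print(f"total size:    {total_mib:.1f} MiB")
```

Under that assumption the three derived figures round to the values shown in the table.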
Variable types
| NUM | 11 |
|---|---|
| CAT | 11 |
| BOOL | 4 |
Warnings
| INCOME has a high cardinality: 6612 distinct values | High cardinality |
|---|---|
| HOME_VAL has a high cardinality: 5106 distinct values | High cardinality |
| BLUEBOOK has a high cardinality: 2789 distinct values | High cardinality |
| OLDCLAIM has a high cardinality: 2857 distinct values | High cardinality |
| YOJ has 454 (5.6%) missing values | Missing |
| INCOME has 445 (5.5%) missing values | Missing |
| HOME_VAL has 464 (5.7%) missing values | Missing |
| JOB has 526 (6.4%) missing values | Missing |
| CAR_AGE has 510 (6.2%) missing values | Missing |
| INDEX has unique values | Unique |
| TARGET_AMT has 6008 (73.6%) zeros | Zeros |
| KIDSDRIV has 7180 (88.0%) zeros | Zeros |
| HOMEKIDS has 5289 (64.8%) zeros | Zeros |
| YOJ has 625 (7.7%) zeros | Zeros |
| CLM_FREQ has 5009 (61.4%) zeros | Zeros |
| MVR_PTS has 3712 (45.5%) zeros | Zeros |
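Warnings of this kind can be produced by simple per-column rules. The following is a minimal sketch, not pandas-profiling's implementation: `column_warnings` is a hypothetical helper, and both thresholds are illustrative assumptions, since the report does not state the ones it used. Percentages are taken over all rows, which matches the figures above (for example 454 / 8161 ≈ 5.6% for YOJ).

```python
def column_warnings(name, values, cardinality_threshold=50, zeros_threshold=0.05):
    """Return (message, type) pairs for one column; None marks a missing cell.

    Both thresholds are illustrative assumptions, not the report's settings.
    """
    n = len(values)
    present = [v for v in values if v is not None]
    found = []

    # High cardinality: many distinct values in a text column.
    n_distinct = len(set(present))
    is_text = bool(present) and all(isinstance(v, str) for v in present)
    if is_text and n_distinct > cardinality_threshold:
        found.append((f"{name} has a high cardinality: {n_distinct} distinct values",
                      "High cardinality"))

    # Missing: any cell without a value.
    n_missing = n - len(present)
    if n_missing:
        found.append((f"{name} has {n_missing} ({100 * n_missing / n:.1f}%) missing values",
                      "Missing"))

    # Unique: every row holds a different, non-missing value.
    if n and n_missing == 0 and n_distinct == n:
        found.append((f"{name} has unique values", "Unique"))

    # Zeros: numeric zeros make up more than the chosen share of rows.
    n_zeros = sum(1 for v in present if not isinstance(v, str) and v == 0)
    if n and n_zeros / n > zeros_threshold:
        found.append((f"{name} has {n_zeros} ({100 * n_zeros / n:.1f}%) zeros", "Zeros"))
    return found


# Toy column, not data from the report.
yoj = [0, 11, None, 12, 0, 14, 10, 13]
for message, kind in column_warnings("YOJ", yoj):
    print(f"{message} | {kind}")
```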
Reproduction
| Analysis started | 2021-01-31 19:47:13.393764 |
|---|---|
| Analysis finished | 2021-01-31 19:47:43.644006 |
| Duration | 30.25 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
INDEX
Real number (ℝ≥0)
| Distinct | 8161 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5151.867663 |
|---|---|
| Minimum | 1 |
| Maximum | 10302 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 509 |
| Q1 | 2559 |
| median | 5133 |
| Q3 | 7745 |
| 95-th percentile | 9791 |
| Maximum | 10302 |
| Range | 10301 |
| Interquartile range (IQR) | 5186 |
Descriptive statistics
| Standard deviation | 2978.893962 |
|---|---|
| Coefficient of variation (CV) | 0.57821632 |
| Kurtosis | -1.20298272 |
| Mean | 5151.867663 |
| Median Absolute Deviation (MAD) | 2591 |
| Skewness | 0.002004613662 |
| Sum | 42044392 |
| Variance | 8873809.235 |
| Monotonicity | Strictly increasing |
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 9566 | 1 | < 0.1% | |
| 3395 | 1 | < 0.1% | |
| 5448 | 1 | < 0.1% | |
| 7497 | 1 | < 0.1% | |
| 1354 | 1 | < 0.1% | |
| 3403 | 1 | < 0.1% | |
| 9550 | 1 | < 0.1% | |
| 5456 | 1 | < 0.1% | |
| 7505 | 1 | < 0.1% | |
| Other values (8151) | 8151 | 99.9% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 10302 | 1 | < 0.1% | |
| 10301 | 1 | < 0.1% | |
| 10299 | 1 | < 0.1% | |
| 10298 | 1 | < 0.1% | |
| 10297 | 1 | < 0.1% |
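The range, IQR and CV rows for INDEX follow directly from the other statistics, and the "strictly increasing" label can be checked with a pairwise comparison. A short sketch using figures copied from the tables above; `is_strictly_increasing` is a hypothetical helper, and the count of absent identifiers assumes INDEX holds integers.

```python
# Values copied from the INDEX statistics.
n_observations = 8161
minimum, maximum = 1, 10302
q1, q3 = 2559, 7745
mean, std = 5151.867663, 2978.893962

value_range = maximum - minimum   # Range row
iqr = q3 - q1                     # Interquartile range row
cv = std / mean                   # Coefficient of variation row

# Assumption: INDEX is an integer identifier, so any integer in
# [minimum, maximum] that is not among the 8161 rows is simply absent.
absent_ids = (maximum - minimum + 1) - n_observations


def is_strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


smallest_five = [1, 2, 4, 5, 6]   # from the smallest-values table (3 is absent)
print(value_range, iqr, round(cv, 4), absent_ids)
print(is_strictly_increasing(smallest_five))
```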
TARGET_FLAG
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 6008 | 73.6% | |
| 1 | 2153 | 26.4% |
TARGET_AMT
Real number (ℝ≥0)
| Distinct | 1949 |
|---|---|
| Distinct (%) | 23.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1504.324648 |
|---|---|
| Minimum | 0 |
| Maximum | 107586.1362 |
| Zeros | 6008 |
| Zeros (%) | 73.6% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1036 |
| 95-th percentile | 6452 |
| Maximum | 107586.1362 |
| Range | 107586.1362 |
| Interquartile range (IQR) | 1036 |
Descriptive statistics
| Standard deviation | 4704.02693 |
|---|---|
| Coefficient of variation (CV) | 3.127002496 |
| Kurtosis | 112.3862763 |
| Mean | 1504.324648 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 8.70950474 |
| Sum | 12276793.45 |
| Variance | 22127869.36 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 6008 | 73.6% | |
| 2327 | 4 | < 0.1% | |
| 5453 | 3 | < 0.1% | |
| 2546 | 3 | < 0.1% | |
| 3501 | 3 | < 0.1% | |
| 3667 | 3 | < 0.1% | |
| 5728 | 3 | < 0.1% | |
| 5692 | 3 | < 0.1% | |
| 3350 | 3 | < 0.1% | |
| 980 | 3 | < 0.1% | |
| Other values (1939) | 2125 | 26.0% |
| Value | Count | Frequency (%) | |
| 0 | 6008 | 73.6% | |
| 30.27728015 | 1 | < 0.1% | |
| 58.53106231 | 1 | < 0.1% | |
| 95.56731717 | 1 | < 0.1% | |
| 108.7414986 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 107586.1362 | 1 | < 0.1% | |
| 85523.65335 | 1 | < 0.1% | |
| 78874.19056 | 1 | < 0.1% | |
| 77907.43028 | 1 | < 0.1% | |
| 73783.46592 | 1 | < 0.1% |
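The 6008 zeros in TARGET_AMT equal the number of rows with TARGET_FLAG = 0, so the overall mean is pulled down by rows with no claim. The sketch below separates the two means using only figures from the tables above. It assumes every nonzero TARGET_AMT belongs to a row with TARGET_FLAG = 1, which the matching counts suggest but the report does not state.

```python
# Figures copied from the TARGET_FLAG and TARGET_AMT tables.
n_observations = 8161
flag_counts = {0: 6008, 1: 2153}
amt_zeros = 6008
amt_sum = 12276793.45

n_claims = n_observations - amt_zeros                     # rows with a nonzero amount
share_no_claim = 100 * flag_counts[0] / n_observations    # percentage of rows with flag 0

mean_all_rows = amt_sum / n_observations   # matches the reported Mean
mean_claims_only = amt_sum / n_claims      # mean amount among nonzero rows

print(n_claims, round(share_no_claim, 1))
print(round(mean_all_rows, 2), round(mean_claims_only, 2))
```

The mean over nonzero rows is roughly 3.8 times the overall mean, which is worth keeping in mind when reading the skewness and kurtosis figures.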
KIDSDRIV
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1710574684 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 7180 |
| Zeros (%) | 88.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5115340939 |
|---|---|
| Coefficient of variation (CV) | 2.99042245 |
| Kurtosis | 11.79177272 |
| Mean | 0.1710574684 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.353069928 |
| Sum | 1396 |
| Variance | 0.2616671292 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 7180 | 88.0% | |
| 1 | 636 | 7.8% | |
| 2 | 279 | 3.4% | |
| 3 | 62 | 0.8% | |
| 4 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 7180 | 88.0% | |
| 1 | 636 | 7.8% | |
| 2 | 279 | 3.4% | |
| 3 | 62 | 0.8% | |
| 4 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4 | 4 | < 0.1% | |
| 3 | 62 | 0.8% | |
| 2 | 279 | 3.4% | |
| 1 | 636 | 7.8% | |
| 0 | 7180 | 88.0% |
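Because KIDSDRIV takes only five values, its descriptive statistics can be rebuilt from the frequency table alone. The sketch below does so; it uses the sample variance (n − 1 in the denominator), which is the choice that reproduces the reported figure.

```python
from math import sqrt

# Value -> count, copied from the KIDSDRIV frequency table.
freq = {0: 7180, 1: 636, 2: 279, 3: 62, 4: 4}

n = sum(freq.values())                                 # number of observations
total = sum(value * count for value, count in freq.items())   # Sum row
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                        # coefficient of variation

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 4))
```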
AGE
Real number (ℝ≥0)
| Distinct | 60 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 6 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 44.79031269 |
|---|---|
| Minimum | 16 |
| Maximum | 81 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 16 |
|---|---|
| 5-th percentile | 30 |
| Q1 | 39 |
| median | 45 |
| Q3 | 51 |
| 95-th percentile | 59 |
| Maximum | 81 |
| Range | 65 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 8.627589456 |
|---|---|
| Coefficient of variation (CV) | 0.1926217733 |
| Kurtosis | -0.06028251742 |
| Mean | 44.79031269 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.02899961562 |
| Sum | 365265 |
| Variance | 74.43529982 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 46 | 401 | 4.9% | |
| 45 | 376 | 4.6% | |
| 48 | 363 | 4.4% | |
| 47 | 355 | 4.3% | |
| 43 | 351 | 4.3% | |
| 41 | 336 | 4.1% | |
| 44 | 336 | 4.1% | |
| 42 | 333 | 4.1% | |
| 50 | 329 | 4.0% | |
| 40 | 317 | 3.9% | |
| Other values (50) | 4658 | 57.1% |
| Value | Count | Frequency (%) | |
| 16 | 5 | 0.1% | |
| 17 | 1 | < 0.1% | |
| 18 | 3 | < 0.1% | |
| 19 | 5 | 0.1% | |
| 20 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 81 | 1 | < 0.1% | |
| 80 | 1 | < 0.1% | |
| 76 | 1 | < 0.1% | |
| 73 | 3 | < 0.1% | |
| 72 | 3 | < 0.1% |
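The frequency tables list the ten most common values and fold the rest into an "Other values" row. A minimal sketch of that layout follows; `common_values` is a hypothetical helper, not the profiler's code. The final lines check the AGE "Other values" row against the counts above.

```python
from collections import Counter


def common_values(values, top=10):
    """Return (label, count) rows: the `top` most common values, then the remainder."""
    counts = Counter(values)
    head = counts.most_common(top)
    other_count = sum(counts.values()) - sum(count for _, count in head)
    other_distinct = len(counts) - len(head)
    rows = list(head)
    if other_distinct:
        rows.append((f"Other values ({other_distinct})", other_count))
    return rows


# Toy data, not taken from the report.
print(common_values([46, 46, 46, 45, 45, 48, 47], top=2))

# Check of the AGE table: 8161 rows, 6 missing, 60 distinct values.
age_top_counts = [401, 376, 363, 355, 351, 336, 336, 333, 329, 317]
age_other_count = 8161 - 6 - sum(age_top_counts)
age_other_distinct = 60 - len(age_top_counts)
print(age_other_distinct, age_other_count)
```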
HOMEKIDS
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.7212351428 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 5289 |
| Zeros (%) | 64.8% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.116323291 |
|---|---|
| Coefficient of variation (CV) | 1.547793812 |
| Kurtosis | 0.6510197811 |
| Mean | 0.7212351428 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.341620234 |
| Sum | 5886 |
| Variance | 1.24617769 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 5289 | 64.8% | |
| 2 | 1118 | 13.7% | |
| 1 | 902 | 11.1% | |
| 3 | 674 | 8.3% | |
| 4 | 164 | 2.0% | |
| 5 | 14 | 0.2% |
| Value | Count | Frequency (%) | |
| 0 | 5289 | 64.8% | |
| 1 | 902 | 11.1% | |
| 2 | 1118 | 13.7% | |
| 3 | 674 | 8.3% | |
| 4 | 164 | 2.0% |
| Value | Count | Frequency (%) | |
| 5 | 14 | 0.2% | |
| 4 | 164 | 2.0% | |
| 3 | 674 | 8.3% | |
| 2 | 1118 | 13.7% | |
| 1 | 902 | 11.1% |
YOJ
Real number (ℝ≥0)
| Distinct | 21 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 454 |
| Missing (%) | 5.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.49928636 |
|---|---|
| Minimum | 0 |
| Maximum | 23 |
| Zeros | 625 |
| Zeros (%) | 7.7% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 9 |
| median | 11 |
| Q3 | 13 |
| 95-th percentile | 15 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 4.092474183 |
|---|---|
| Coefficient of variation (CV) | 0.3897859379 |
| Kurtosis | 1.179968995 |
| Mean | 10.49928636 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -1.203436037 |
| Sum | 80918 |
| Variance | 16.74834494 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 12 | 1158 | 14.2% | |
| 13 | 1016 | 12.4% | |
| 11 | 1003 | 12.3% | |
| 14 | 785 | 9.6% | |
| 10 | 749 | 9.2% | |
| 0 | 625 | 7.7% | |
| 9 | 521 | 6.4% | |
| 15 | 463 | 5.7% | |
| 8 | 384 | 4.7% | |
| 7 | 300 | 3.7% | |
| Other values (11) | 703 | 8.6% | |
| (Missing) | 454 | 5.6% |
| Value | Count | Frequency (%) | |
| 0 | 625 | 7.7% | |
| 1 | 6 | 0.1% | |
| 2 | 15 | 0.2% | |
| 3 | 36 | 0.4% | |
| 4 | 37 | 0.5% |
| Value | Count | Frequency (%) | |
| 23 | 2 | < 0.1% | |
| 19 | 12 | 0.1% | |
| 18 | 25 | 0.3% | |
| 17 | 101 | 1.2% | |
| 16 | 204 | 2.5% |
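YOJ has 454 missing values, and the reported mean is taken over the non-missing rows only. The arithmetic below confirms this from the Sum and Missing rows; no other assumption is needed.

```python
# Figures copied from the YOJ tables.
n_observations = 8161
n_missing = 454
total = 80918                                # "Sum" row

n_present = n_observations - n_missing       # rows that actually hold a value
mean_present = total / n_present             # mean over non-missing rows
mean_all_rows = total / n_observations       # what the mean would be if missing counted as 0

print(n_present)
print(round(mean_present, 4))
print(round(mean_all_rows, 4))
```

Only the first of the two means agrees with the reported value, so missing cells are excluded rather than counted as zero.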
INCOME
Categorical
| Distinct | 6612 |
|---|---|
| Distinct (%) | 85.7% |
| Missing | 445 |
| Missing (%) | 5.5% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| $0 | 615 | 7.5% | |
| $61,790 | 4 | < 0.1% | |
| $48,509 | 4 | < 0.1% | |
| $26,840 | 4 | < 0.1% | |
| $47,513 | 3 | < 0.1% | |
| $183,296 | 3 | < 0.1% | |
| $143,073 | 3 | < 0.1% | |
| $7,971 | 3 | < 0.1% | |
| $63,357 | 3 | < 0.1% | |
| $50,166 | 3 | < 0.1% | |
| Other values (6602) | 7071 | 86.6% | |
| (Missing) | 445 | 5.5% |
Unique
| Unique | 6156 |
|---|---|
| Unique (%) | 79.8% |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.519666708 |
| Min length | 2 |
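INCOME is profiled as categorical, with string-length statistics instead of quantiles, because its values are stored as currency strings such as `$67,349`. The same applies to HOME_VAL, BLUEBOOK and OLDCLAIM. A minimal sketch of converting such strings to numbers follows; `parse_currency` is a hypothetical helper, and missing values are represented as `None` for simplicity.

```python
def parse_currency(text):
    """Convert a string such as '$67,349' to a float; pass missing values through."""
    if text is None:
        return None
    return float(text.replace("$", "").replace(",", ""))


# Sample values in the format shown in the report; None stands for a missing cell.
sample = ["$67,349", "$91,449", "$16,039", None, "$0"]
incomes = [parse_currency(value) for value in sample]
print(incomes)  # [67349.0, 91449.0, 16039.0, None, 0.0]
```

With pandas, one equivalent is `pd.to_numeric(df["INCOME"].str.replace(r"[$,]", "", regex=True))`, where `df` is assumed to be the profiled DataFrame. After conversion the column would be profiled as numeric.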
PARENT1
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| No | 7084 | 86.8% | |
| Yes | 1077 | 13.2% |
HOME_VAL
Categorical
| Distinct | 5106 |
|---|---|
| Distinct (%) | 66.3% |
| Missing | 464 |
| Missing (%) | 5.7% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| $0 | 2294 | 28.1% | |
| $159,568 | 3 | < 0.1% | |
| $153,061 | 3 | < 0.1% | |
| $238,724 | 3 | < 0.1% | |
| $288,592 | 3 | < 0.1% | |
| $123,109 | 3 | < 0.1% | |
| $332,673 | 3 | < 0.1% | |
| $111,129 | 3 | < 0.1% | |
| $166,481 | 3 | < 0.1% | |
| $196,320 | 3 | < 0.1% | |
| Other values (5096) | 5376 | 65.9% | |
| (Missing) | 464 | 5.7% |
Unique
| Unique | 4819 |
|---|---|
| Unique (%) | 62.6% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 5.983702978 |
| Min length | 2 |
MSTATUS
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| Yes | 4894 | 60.0% | |
| z_No | 3267 | 40.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.400318588 |
| Min length | 3 |
SEX
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| z_F | 4375 | 53.6% | |
| M | 3786 | 46.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 2.072172528 |
| Min length | 1 |
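The Length statistics describe the character length of each label, weighted by how often the label occurs. For SEX they can be reproduced from the two counts in the frequency table, as this short sketch shows.

```python
from statistics import median

# Label -> count, copied from the SEX frequency table.
counts = {"z_F": 4375, "M": 3786}
n = sum(counts.values())

# Mean length: each label's length weighted by its count.
mean_length = sum(len(label) * count for label, count in counts.items()) / n

# Median, max and min: expand to one length per row.
lengths = sorted(len(label) for label, count in counts.items() for _ in range(count))
median_length = median(lengths)
max_length, min_length = lengths[-1], lengths[0]

print(round(mean_length, 6), median_length, max_length, min_length)
```

The median is 3 because `z_F`, the longer label, covers more than half of the rows.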
EDUCATION
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| z_High School | 2330 | 28.6% | |
| Bachelors | 2242 | 27.5% | |
| Masters | 1658 | 20.3% | |
| <High School | 1203 | 14.7% | |
| PhD | 728 | 8.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 9.642690847 |
| Min length | 3 |
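Several categorical labels carry a `z_` prefix (`z_High School`, `z_No`, `z_F`, `z_SUV`, `z_Blue Collar`), which also inflates the length statistics. The sketch below strips it, on the assumption that the prefix is an artifact of how the source data was prepared and carries no meaning; the report itself does not say so. `strip_z_prefix` is a hypothetical helper.

```python
def strip_z_prefix(label):
    """Remove a leading 'z_' from a label, leaving other labels untouched."""
    return label[2:] if label.startswith("z_") else label


education = ["z_High School", "Bachelors", "Masters", "<High School", "PhD"]
cleaned = [strip_z_prefix(label) for label in education]
print(cleaned)  # ['High School', 'Bachelors', 'Masters', '<High School', 'PhD']
```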
JOB
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 526 |
| Missing (%) | 6.4% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| z_Blue Collar | 1825 | 22.4% | |
| Clerical | 1271 | 15.6% | |
| Professional | 1117 | 13.7% | |
| Manager | 988 | 12.1% | |
| Lawyer | 835 | 10.2% | |
| Student | 712 | 8.7% | |
| Home Maker | 641 | 7.9% | |
| Doctor | 246 | 3.0% | |
| (Missing) | 526 | 6.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 9.027202549 |
| Min length | 3 |
TRAVTIME
Real number (ℝ≥0)
| Distinct | 97 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33.48572479 |
|---|---|
| Minimum | 5 |
| Maximum | 142 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 22 |
| median | 33 |
| Q3 | 44 |
| 95-th percentile | 60 |
| Maximum | 142 |
| Range | 137 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 15.90833341 |
|---|---|
| Coefficient of variation (CV) | 0.4750780672 |
| Kurtosis | 0.6663746248 |
| Mean | 33.48572479 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 0.4469816868 |
| Sum | 273277 |
| Variance | 253.0750719 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 334 | 4.1% | |
| 35 | 219 | 2.7% | |
| 30 | 219 | 2.7% | |
| 32 | 214 | 2.6% | |
| 25 | 214 | 2.6% | |
| 36 | 211 | 2.6% | |
| 29 | 207 | 2.5% | |
| 33 | 206 | 2.5% | |
| 24 | 204 | 2.5% | |
| 37 | 202 | 2.5% | |
| Other values (87) | 5931 | 72.7% |
| Value | Count | Frequency (%) | |
| 5 | 334 | 4.1% | |
| 6 | 49 | 0.6% | |
| 7 | 43 | 0.5% | |
| 8 | 54 | 0.7% | |
| 9 | 70 | 0.9% |
| Value | Count | Frequency (%) | |
| 142 | 1 | < 0.1% | |
| 134 | 1 | < 0.1% | |
| 124 | 1 | < 0.1% | |
| 113 | 1 | < 0.1% | |
| 103 | 1 | < 0.1% |
CAR_USE
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| Private | 5132 | 62.9% | |
| Commercial | 3029 | 37.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 8.113466487 |
| Min length | 7 |
BLUEBOOK
Categorical
| Distinct | 2789 |
|---|---|
| Distinct (%) | 34.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| $1,500 | 157 | 1.9% | |
| $6,000 | 34 | 0.4% | |
| $5,800 | 33 | 0.4% | |
| $6,200 | 33 | 0.4% | |
| $6,400 | 31 | 0.4% | |
| $6,100 | 30 | 0.4% | |
| $5,900 | 30 | 0.4% | |
| $6,500 | 29 | 0.4% | |
| $5,400 | 28 | 0.3% | |
| $5,700 | 26 | 0.3% | |
| Other values (2779) | 7730 | 94.7% |
Unique
| Unique | 900 |
|---|---|
| Unique (%) | 11.0% |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.717068987 |
| Min length | 6 |
TIF
Real number (ℝ≥0)
| Distinct | 23 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.351304987 |
|---|---|
| Minimum | 1 |
| Maximum | 25 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 4 |
| Q3 | 7 |
| 95-th percentile | 13 |
| Maximum | 25 |
| Range | 24 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.146635309 |
|---|---|
| Coefficient of variation (CV) | 0.7748830088 |
| Kurtosis | 0.4243279295 |
| Mean | 5.351304987 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.891139559 |
| Sum | 43672 |
| Variance | 17.19458439 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 2533 | 31.0% | |
| 6 | 1341 | 16.4% | |
| 4 | 1242 | 15.2% | |
| 10 | 780 | 9.6% | |
| 7 | 620 | 7.6% | |
| 3 | 424 | 5.2% | |
| 13 | 278 | 3.4% | |
| 11 | 242 | 3.0% | |
| 9 | 225 | 2.8% | |
| 17 | 104 | 1.3% | |
| Other values (13) | 372 | 4.6% |
| Value | Count | Frequency (%) | |
| 1 | 2533 | 31.0% | |
| 2 | 6 | 0.1% | |
| 3 | 424 | 5.2% | |
| 4 | 1242 | 15.2% | |
| 5 | 52 | 0.6% |
| Value | Count | Frequency (%) | |
| 25 | 2 | < 0.1% | |
| 22 | 3 | < 0.1% | |
| 21 | 11 | 0.1% | |
| 20 | 8 | 0.1% | |
| 19 | 8 | 0.1% |
CAR_TYPE
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| z_SUV | 2294 | 28.1% | |
| Minivan | 2145 | 26.3% | |
| Pickup | 1389 | 17.0% | |
| Sports Car | 907 | 11.1% | |
| Van | 750 | 9.2% | |
| Panel Truck | 676 | 8.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 6 |
| Mean length | 6.564759221 |
| Min length | 3 |
RED_CAR
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| no | 5783 | 70.9% | |
| yes | 2378 | 29.1% |
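The Boolean variables use different spellings: `0`/`1` for TARGET_FLAG, `No`/`Yes` for PARENT1 and REVOKED, and lowercase `no`/`yes` for RED_CAR. A minimal sketch of mapping them to a common representation follows; `to_bool` and the two label sets are hypothetical and cover only the spellings seen in this report.

```python
# Assumption: these are the only spellings that occur in the Boolean columns.
TRUE_LABELS = {"yes", "1"}
FALSE_LABELS = {"no", "0"}


def to_bool(value):
    """Map a Boolean spelling from the report to True/False; reject anything else."""
    label = str(value).strip().lower()
    if label in TRUE_LABELS:
        return True
    if label in FALSE_LABELS:
        return False
    raise ValueError(f"unrecognized Boolean label: {value!r}")


print([to_bool(v) for v in ["No", "Yes"]])   # PARENT1 / REVOKED spelling
print([to_bool(v) for v in ["no", "yes"]])   # RED_CAR spelling
print([to_bool(v) for v in [0, 1]])          # TARGET_FLAG spelling
```

Raising on unknown labels, rather than defaulting to `False`, keeps unexpected spellings from being silently miscounted.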
OLDCLAIM
Categorical
| Distinct | 2857 |
|---|---|
| Distinct (%) | 35.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| $0 | 5009 | 61.4% | |
| $4,263 | 4 | < 0.1% | |
| $1,310 | 4 | < 0.1% | |
| $1,391 | 4 | < 0.1% | |
| $3,338 | 3 | < 0.1% | |
| $1,105 | 3 | < 0.1% | |
| $4,567 | 3 | < 0.1% | |
| $3,068 | 3 | < 0.1% | |
| $4,448 | 3 | < 0.1% | |
| $5,289 | 3 | < 0.1% | |
| Other values (2847) | 3122 | 38.3% |
Unique
| Unique | 2592 |
|---|---|
| Unique (%) | 31.8% |
Length
| Max length | 7 |
|---|---|
| Median length | 2 |
| Mean length | 3.614753094 |
| Min length | 2 |
CLM_FREQ
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.7985540988 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 5009 |
| Zeros (%) | 61.4% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.158452681 |
|---|---|
| Coefficient of variation (CV) | 1.450687791 |
| Kurtosis | 0.2860042955 |
| Mean | 0.7985540988 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.209242991 |
| Sum | 6517 |
| Variance | 1.342012615 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 5009 | 61.4% | |
| 2 | 1171 | 14.3% | |
| 1 | 997 | 12.2% | |
| 3 | 776 | 9.5% | |
| 4 | 190 | 2.3% | |
| 5 | 18 | 0.2% |
| Value | Count | Frequency (%) | |
| 0 | 5009 | 61.4% | |
| 1 | 997 | 12.2% | |
| 2 | 1171 | 14.3% | |
| 3 | 776 | 9.5% | |
| 4 | 190 | 2.3% |
| Value | Count | Frequency (%) | |
| 5 | 18 | 0.2% | |
| 4 | 190 | 2.3% | |
| 3 | 776 | 9.5% | |
| 2 | 1171 | 14.3% | |
| 1 | 997 | 12.2% |
REVOKED
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| No | 7161 | 87.7% | |
| Yes | 1000 | 12.3% |
MVR_PTS
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.695503002 |
|---|---|
| Minimum | 0 |
| Maximum | 13 |
| Zeros | 3712 |
| Zeros (%) | 45.5% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 13 |
| Range | 13 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.147111744 |
|---|---|
| Coefficient of variation (CV) | 1.266356793 |
| Kurtosis | 1.378141796 |
| Mean | 1.695503002 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.348335868 |
| Sum | 13837 |
| Variance | 4.610088843 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 3712 | 45.5% | |
| 1 | 1157 | 14.2% | |
| 2 | 948 | 11.6% | |
| 3 | 758 | 9.3% | |
| 4 | 599 | 7.3% | |
| 5 | 399 | 4.9% | |
| 6 | 266 | 3.3% | |
| 7 | 167 | 2.0% | |
| 8 | 84 | 1.0% | |
| 9 | 45 | 0.6% | |
| Other values (3) | 26 | 0.3% |
| Value | Count | Frequency (%) | |
| 0 | 3712 | 45.5% | |
| 1 | 1157 | 14.2% | |
| 2 | 948 | 11.6% | |
| 3 | 758 | 9.3% | |
| 4 | 599 | 7.3% |
| Value | Count | Frequency (%) | |
| 13 | 2 | < 0.1% | |
| 11 | 11 | 0.1% | |
| 10 | 13 | 0.2% | |
| 9 | 45 | 0.6% | |
| 8 | 84 | 1.0% |
CAR_AGE
Real number (ℝ)
| Distinct | 30 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 510 |
| Missing (%) | 6.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.328323095 |
|---|---|
| Minimum | -3 |
| Maximum | 28 |
| Zeros | 3 |
| Zeros (%) | < 0.1% |
| Memory size | 63.8 KiB |
Quantile statistics
| Minimum | -3 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 8 |
| Q3 | 12 |
| 95-th percentile | 18 |
| Maximum | 28 |
| Range | 31 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 5.70074244 |
|---|---|
| Coefficient of variation (CV) | 0.6845006341 |
| Kurtosis | -0.7480917592 |
| Mean | 8.328323095 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.2820637171 |
| Sum | 63720 |
| Variance | 32.49846436 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 1934 | 23.7% | |
| 8 | 537 | 6.6% | |
| 9 | 526 | 6.4% | |
| 7 | 524 | 6.4% | |
| 10 | 469 | 5.7% | |
| 11 | 460 | 5.6% | |
| 6 | 451 | 5.5% | |
| 12 | 368 | 4.5% | |
| 13 | 356 | 4.4% | |
| 14 | 311 | 3.8% | |
| Other values (20) | 1715 | 21.0% | |
| (Missing) | 510 | 6.2% |
| Value | Count | Frequency (%) | |
| -3 | 1 | < 0.1% | |
| 0 | 3 | < 0.1% | |
| 1 | 1934 | 23.7% | |
| 2 | 12 | 0.1% | |
| 3 | 54 | 0.7% |
| Value | Count | Frequency (%) | |
| 28 | 1 | < 0.1% | |
| 27 | 1 | < 0.1% | |
| 26 | 2 | < 0.1% | |
| 25 | 6 | 0.1% | |
| 24 | 10 | 0.1% |
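CAR_AGE has a minimum of -3, which occurs in a single row. A sketch of one way to handle it follows, assuming that a negative car age is a data-entry error and should be treated as missing; the report itself makes no such judgment, and `clean_car_age` is a hypothetical helper.

```python
def clean_car_age(value):
    """Return None for missing or negative ages, otherwise the age unchanged."""
    if value is None or value < 0:
        return None
    return value


# Sample values in the range shown above; None stands for a missing cell.
car_age = [18.0, 1.0, -3.0, None, 0.0]
cleaned = [clean_car_age(value) for value in car_age]
print(cleaned)  # [18.0, 1.0, None, None, 0.0]
```

Zero is kept as a valid age here, since the report lists three such rows and nothing marks them as errors.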
URBANICITY
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.8 KiB |
| Value | Count | Frequency (%) | |
| Highly Urban/ Urban | 6492 | 79.5% | |
| z_Highly Rural/ Rural | 1669 | 20.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 21 |
|---|---|
| Median length | 19 |
| Mean length | 19.4090185 |
| Min length | 19 |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
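The calculation described above can be sketched in plain Python. This is an illustration on toy data, not the code the report used; `pearson_r` is a hypothetical helper.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 7 for a in x], y)
print(round(r, 4), round(r_rescaled, 4))
```

The same 1/n factor appears in the covariance and in both standard deviations, so using n − 1 throughout would give the same r.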
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
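A minimal sketch of that calculation: replace each value by its rank, then apply the Pearson formula to the ranks. The helpers `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is a common convention assumed here rather than something the text specifies.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
print(round(spearman_rho(x, y), 4), round(pearson_r(x, y), 4))
```

On this toy data ρ reaches its maximum while r stays below it, which illustrates the difference between monotonic and linear correlation.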
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
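The pair-counting definition above can be sketched directly. This form, often called τ-a, is what the text describes; libraries commonly use a tie-adjusted variant (τ-b), so results may differ when the data contains ties. `kendall_tau_a` is a hypothetical helper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 is a tie and counts as neither
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Toy data: two of the three pairs are concordant, one is discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```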
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available separately.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | INDEX | TARGET_FLAG | TARGET_AMT | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | SEX | EDUCATION | JOB | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CAR_AGE | URBANICITY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0.0 | 0 | 60.0 | 0 | 11.0 | $67,349 | No | $0 | z_No | M | PhD | Professional | 14 | Private | $14,230 | 11 | Minivan | yes | $4,461 | 2 | No | 3 | 18.0 | Highly Urban/ Urban |
| 1 | 2 | 0 | 0.0 | 0 | 43.0 | 0 | 11.0 | $91,449 | No | $257,252 | z_No | M | z_High School | z_Blue Collar | 22 | Commercial | $14,940 | 1 | Minivan | yes | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban |
| 2 | 4 | 0 | 0.0 | 0 | 35.0 | 1 | 10.0 | $16,039 | No | $124,191 | Yes | z_F | z_High School | Clerical | 5 | Private | $4,010 | 4 | z_SUV | no | $38,690 | 2 | No | 3 | 10.0 | Highly Urban/ Urban |
| 3 | 5 | 0 | 0.0 | 0 | 51.0 | 0 | 14.0 | NaN | No | $306,251 | Yes | M | <High School | z_Blue Collar | 32 | Private | $15,440 | 7 | Minivan | yes | $0 | 0 | No | 0 | 6.0 | Highly Urban/ Urban |
| 4 | 6 | 0 | 0.0 | 0 | 50.0 | 0 | NaN | $114,986 | No | $243,925 | Yes | z_F | PhD | Doctor | 36 | Private | $18,000 | 1 | z_SUV | no | $19,217 | 2 | Yes | 3 | 17.0 | Highly Urban/ Urban |
| 5 | 7 | 1 | 2946.0 | 0 | 34.0 | 1 | 12.0 | $125,301 | Yes | $0 | z_No | z_F | Bachelors | z_Blue Collar | 46 | Commercial | $17,430 | 1 | Sports Car | no | $0 | 0 | No | 0 | 7.0 | Highly Urban/ Urban |
| 6 | 8 | 0 | 0.0 | 0 | 54.0 | 0 | NaN | $18,755 | No | NaN | Yes | z_F | <High School | z_Blue Collar | 33 | Private | $8,780 | 1 | z_SUV | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban |
| 7 | 11 | 1 | 4021.0 | 1 | 37.0 | 2 | NaN | $107,961 | No | $333,680 | Yes | M | Bachelors | z_Blue Collar | 44 | Commercial | $16,970 | 1 | Van | yes | $2,374 | 1 | Yes | 10 | 7.0 | Highly Urban/ Urban |
| 8 | 12 | 1 | 2501.0 | 0 | 34.0 | 0 | 10.0 | $62,978 | No | $0 | z_No | z_F | Bachelors | Clerical | 34 | Private | $11,200 | 1 | z_SUV | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban |
| 9 | 13 | 0 | 0.0 | 0 | 50.0 | 0 | 7.0 | $106,952 | No | $0 | z_No | M | Bachelors | Professional | 48 | Commercial | $18,510 | 7 | Van | no | $0 | 0 | No | 1 | 17.0 | z_Highly Rural/ Rural |
Last rows
| | INDEX | TARGET_FLAG | TARGET_AMT | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | SEX | EDUCATION | JOB | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CAR_AGE | URBANICITY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8151 | 10291 | 0 | 0.0 | 0 | 54.0 | 0 | 13.0 | $81,818 | No | $272,725 | Yes | M | Bachelors | Manager | 18 | Commercial | $19,660 | 1 | Van | no | $24,690 | 1 | Yes | 6 | 4.0 | Highly Urban/ Urban |
| 8152 | 10292 | 0 | 0.0 | 1 | 46.0 | 0 | 12.0 | $45,018 | No | $0 | z_No | M | z_High School | z_Blue Collar | 26 | Private | $15,060 | 4 | Minivan | no | $33,026 | 3 | No | 0 | 1.0 | z_Highly Rural/ Rural |
| 8153 | 10293 | 0 | 0.0 | 0 | 48.0 | 0 | 10.0 | $111,305 | No | $0 | z_No | z_F | PhD | Doctor | 59 | Private | $17,430 | 13 | z_SUV | no | $0 | 0 | No | 4 | 18.0 | Highly Urban/ Urban |
| 8154 | 10295 | 0 | 0.0 | 1 | 38.0 | 4 | 16.0 | $12,717 | No | $0 | Yes | z_F | Bachelors | Student | 15 | Commercial | $24,740 | 1 | Pickup | no | $9,245 | 3 | No | 3 | 15.0 | Highly Urban/ Urban |
| 8155 | 10296 | 0 | 0.0 | 0 | 41.0 | 0 | 7.0 | $6,256 | No | $0 | z_No | M | z_High School | Student | 41 | Private | $5,600 | 1 | Pickup | no | $0 | 0 | No | 0 | 7.0 | z_Highly Rural/ Rural |
| 8156 | 10297 | 0 | 0.0 | 0 | 35.0 | 0 | 11.0 | $43,112 | No | $0 | z_No | M | z_High School | z_Blue Collar | 51 | Commercial | $27,330 | 10 | Panel Truck | yes | $0 | 0 | No | 0 | 8.0 | z_Highly Rural/ Rural |
| 8157 | 10298 | 0 | 0.0 | 1 | 45.0 | 2 | 9.0 | $164,669 | No | $386,273 | Yes | M | PhD | Manager | 21 | Private | $13,270 | 15 | Minivan | no | $0 | 0 | No | 2 | 17.0 | Highly Urban/ Urban |
| 8158 | 10299 | 0 | 0.0 | 0 | 46.0 | 0 | 9.0 | $107,204 | No | $332,591 | Yes | M | Masters | NaN | 36 | Commercial | $24,490 | 6 | Panel Truck | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban |
| 8159 | 10301 | 0 | 0.0 | 0 | 50.0 | 0 | 7.0 | $43,445 | No | $149,248 | Yes | z_F | Bachelors | Home Maker | 36 | Private | $22,550 | 6 | Minivan | no | $0 | 0 | No | 0 | 11.0 | Highly Urban/ Urban |
| 8160 | 10302 | 0 | 0.0 | 0 | 52.0 | 0 | 11.0 | $53,235 | No | $197,017 | Yes | z_F | z_High School | Clerical | 64 | Private | $19,400 | 6 | Minivan | no | $0 | 0 | No | 0 | 9.0 | z_Highly Rural/ Rural |